CSC 453

Winter 2026 Day 4

Admin

  • Quiz 2 due Friday
  • Lab 2 due Monday
  • SLOsh due Monday
  • No class/lab Tuesday

Process management

Questions to consider

  • Which system calls are related to process management and lifecycles?
  • How does the process hierarchy work?
  • What are zombies and orphans? Why do zombies exist?

UNIX process APIs

  • fork() creates a new child process
    • All processes are created by forking from a parent
    • The init process (PID 1) is the ancestor of all processes
      • Run pstree in a terminal to see
  • exec() replaces the calling process’s memory image with a given executable (same process and PID, new program)
  • exit() terminates a process
  • wait() causes a parent to block until a child terminates
  • Many variants of the above system calls exist with different arguments (e.g. execvp(), waitpid())

What happens during a fork()?

  • A new process is created by copying the parent’s memory image (in practice, usually lazily via copy-on-write)
  • Parent and child each have their own address space, isolated from each other, which allows independent processing
  • The new process is added to the OS process list and scheduled
  • Parent and child both resume execution just after fork(), distinguished by its return value: 0 in the child, the child’s PID in the parent
  • Parent and child execute and modify the memory data independently
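The fork() behaviour described above can be sketched in Python, whose os.fork() wraps the same system call. This is a minimal illustration, not a reference implementation; the function name fork_demo and the values 41 and 42 are made up for the example.

```python
import os


def fork_demo():
    """Fork once and show that parent and child hold separate copies of data."""
    counter = 41                 # exists in the parent before the fork
    pid = os.fork()              # returns 0 in the child, the child's PID in the parent
    if pid == 0:
        counter += 1             # changes only the child's copy
        os._exit(counter)        # report the child's value through its exit status
    _, status = os.waitpid(pid, 0)
    # The parent's counter is untouched by the child's increment.
    return pid, os.WEXITSTATUS(status), counter


if __name__ == "__main__":
    child_pid, child_value, parent_value = fork_demo()
    print("child pid:", child_pid)
    print("value seen by child:", child_value)
    print("value seen by parent:", parent_value)
```

The child uses os._exit() instead of returning so that it never continues into the parent's code path.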

Process management

Process creation

  • Different execution models
    • Parent & child may execute independently
    • Parent may wait for child
    • Child may create more children (Process hierarchies)
    • Parent may kill children
  • Child often invokes exec() to change its memory image to a new program
  • Why two steps (fork() then exec())?
    • Allows the child to change file descriptors and other settings before exec()
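As a rough sketch of why the two-step model is useful, the child below redirects its standard output to a file after fork() and before exec(), which is essentially what a shell does for `cmd > file`. The helper name run_redirected and the exit code 127 for a failed exec are assumptions made for this example.

```python
import os
import tempfile


def run_redirected(argv, out_path):
    """Run argv in a child whose stdout is redirected to out_path."""
    pid = os.fork()
    if pid == 0:
        try:
            # Between fork() and exec() the child can rearrange its own
            # file descriptors without affecting the parent.
            fd = os.open(out_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
            os.dup2(fd, 1)       # make fd 1 (stdout) refer to the file
            os.close(fd)
            os.execvp(argv[0], argv)   # only returns (by raising) on failure
        finally:
            os._exit(127)        # reached only if something above failed
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "out.txt")
        code = run_redirected(["echo", "hello"], path)
        with open(path) as f:
            print("exit code:", code, "captured:", repr(f.read()))
```

The new program (echo here) needs no knowledge of the redirection; it simply writes to file descriptor 1.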

Process destruction

  • Some operating systems do not allow a child to exist once its parent has terminated: when a process terminates, all of its children must be terminated too
    • Cascading termination: all children, grandchildren, etc., are terminated
    • The termination is initiated by the operating system
  • The parent process may wait for termination of a child process by using the wait() system call. The call returns status information and the pid of the terminated process

    pid = wait(&status); 
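The same call is available in Python as os.wait(), which returns the (pid, status) pair directly. The sketch below forks a few children with different exit codes and reaps them in whatever order they finish; the function name reap_children and the exit codes are illustrative only.

```python
import os


def reap_children(exit_codes):
    """Fork one child per exit code, then wait() for each of them."""
    expected = {}
    for code in exit_codes:
        pid = os.fork()
        if pid == 0:
            os._exit(code)               # child terminates immediately
        expected[pid] = code             # parent remembers what to expect
    observed = {}
    while len(observed) < len(expected):
        pid, status = os.wait()          # blocks until some child terminates
        if pid in expected and os.WIFEXITED(status):
            observed[pid] = os.WEXITSTATUS(status)
    return expected, observed


if __name__ == "__main__":
    expected, observed = reap_children([0, 3, 7])
    for pid, code in observed.items():
        print(f"child {pid} exited with status {code}")
```

The raw status value packs more than the exit code, which is why it is decoded with WIFEXITED and WEXITSTATUS, just as in C.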

Zombies and orphans

  • If a process terminates and its parent has not (yet) invoked wait(), the process is a zombie
    • Zombie = dead but not yet reaped (exit status hasn’t been read)
    • Still has an entry in the process table
    • Zombies exist so that the kernel can preserve a child’s exit status until the parent calls wait(), even if the child exits first
  • If a parent terminates while its child is still running, the child is an orphan
    • Orphan = alive but parent is gone
    • init benevolently adopts orphans
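A zombie can be observed directly on Linux. In this sketch, which assumes a Linux /proc filesystem, the parent uses waitid() with WNOWAIT to learn that the child has exited without reaping it, reads the child's state from /proc, and only then reaps it. The function name observe_zombie and the exit code 7 are made up for the example.

```python
import os


def observe_zombie():
    """Return (state before reaping, exit code, whether /proc entry vanished)."""
    pid = os.fork()
    if pid == 0:
        os._exit(7)                      # child dies right away
    # Block until the child has exited, but leave it unreaped (WNOWAIT).
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)
    with open(f"/proc/{pid}/stat") as f:
        stat = f.read()
    # Format is "pid (comm) state ..."; the state follows the last ')'.
    state = stat.rsplit(")", 1)[1].split()[0]
    _, status = os.waitpid(pid, 0)       # reap: the kernel can now free the entry
    gone = not os.path.exists(f"/proc/{pid}")
    return state, os.WEXITSTATUS(status), gone


if __name__ == "__main__":
    state, code, gone = observe_zombie()
    print("state before wait():", state)
    print("exit status preserved for the parent:", code)
    print("process table entry gone after wait():", gone)
```

The exit status is still retrievable after the child is dead, which is exactly the information the zombie entry exists to hold.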

What isn’t clear?

Comments? Thoughts?

Modern process isolation

Questions to consider

  • What are namespaces and cgroups?
  • How do they differ from virtual machines?
  • How do containers use them to isolate processes?

Isolation without full virtualization

  • Virtual machines provide strong isolation by virtualizing the hardware and running a complete guest OS on top of it
    • Heavier resource overhead (multiple OS instances)
    • Better isolation between workloads
  • Containers provide lightweight isolation using OS-level mechanisms
    • Share the same kernel
    • Much lower overhead than VMs
    • Linux kernel provides the building blocks: namespaces and cgroups

Namespaces: logical isolation

  • Namespaces partition global system resources so they appear as separate isolated instances
  • Each process belongs to exactly one namespace of each type and only sees the resources in those namespaces
  • Types of namespaces:
    • PID: process IDs (what processes can a process see?)
    • Network: network interfaces, ports, routing tables
    • Mount: filesystem mounts (what can a process access?)
    • IPC: IPC objects, message queues
    • UTS: hostname and domain name
    • User: user and group IDs (who owns the process?)

Namespaces example

  • Two processes in separate PID namespaces can each see themselves as PID 1 (init)
  • Each sees only the processes within its own namespace
  • From the host OS perspective, they have different global PIDs
  • Enables the illusion that each container has its own isolated process tree

Seeing namespaces in action

  • View namespaces a process belongs to:

    ls -l /proc/self/ns/
  • Use unshare to create a new PID namespace (requires root privileges, e.g. via sudo):

    unshare -pf --mount-proc bash      # creates new PID namespace, your shell is PID 1
    ps aux               # only sees processes in this namespace
    exit                 # back to host namespace
    ps aux               # will show all processes on the host
  • Compare namespace identifiers before and after (same inode number = same namespace):

    readlink /proc/self/ns/pid     # prints the namespace as pid:[inode number]
    unshare -pf --mount-proc bash -c 'readlink /proc/self/ns/pid'  # different inode number
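The same namespace identifiers can be read programmatically. The sketch below assumes a Linux /proc filesystem and simply resolves the symbolic links under /proc/<pid>/ns; the helper names namespace_ids and same_namespace are hypothetical.

```python
import os


def namespace_ids(pid="self"):
    """Map namespace type (pid, net, mnt, ...) to its identifier string."""
    ns_dir = f"/proc/{pid}/ns"
    return {
        name: os.readlink(os.path.join(ns_dir, name))   # e.g. "pid:[...]"
        for name in sorted(os.listdir(ns_dir))
    }


def same_namespace(pid_a, pid_b, kind="pid"):
    """Two processes share a namespace if the identifiers match."""
    return namespace_ids(pid_a)[kind] == namespace_ids(pid_b)[kind]


if __name__ == "__main__":
    for name, ident in namespace_ids().items():
        print(f"{name:20s} {ident}")
```

Reading another process's entries may require permission over that process, so comparing against arbitrary PIDs can fail with a permission error.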

Cgroups: resource limits

  • cgroups (control groups) limit, prioritize, and account for resource usage of process groups
  • Key capabilities:
    • CPU limits: restrict how much CPU time a group can use
    • Memory limits: cap memory usage; OOM killer invoked if exceeded
    • I/O limits: restrict disk I/O bandwidth
    • Device access: restrict which devices a process can access
  • All processes in a cgroup share the same resource limitations

Seeing cgroups in action

  • View what cgroup a process belongs to:

    cat /proc/self/cgroup
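  • Check your per-process resource limits (rlimits, set with ulimit/setrlimit; these are separate from cgroup limits):

    cat /proc/self/limits
  • cgroup limits themselves are exposed as files under /sys/fs/cgroup (for example memory.max and cpu.max in cgroup v2)
  • In practice, cgroups are largely invisible to users: the kernel enforces limits automatically when a group exceeds its allocated resources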
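To make cgroup membership concrete, here is a small sketch that parses the hierarchy:controllers:path lines of /proc/<pid>/cgroup and, where a cgroup v2 hierarchy is mounted at /sys/fs/cgroup, tries to read a limit file. The function names are hypothetical, the mount point is an assumption about the system, and read_limit returns None whenever the file is absent or unreadable.

```python
import os


def parse_cgroup_line(line):
    """Split one 'hierarchy-ID:controller-list:path' line into its parts."""
    hierarchy, controllers, path = line.strip().split(":", 2)
    return int(hierarchy), [c for c in controllers.split(",") if c], path


def current_cgroups(pid="self"):
    """Return the parsed cgroup memberships of a process (Linux only)."""
    with open(f"/proc/{pid}/cgroup") as f:
        return [parse_cgroup_line(line) for line in f if line.strip()]


def read_limit(cgroup_path, filename="memory.max", root="/sys/fs/cgroup"):
    """Read a cgroup v2 limit file; the root mount point is an assumption."""
    full = os.path.join(root, cgroup_path.lstrip("/"), filename)
    try:
        with open(full) as f:
            return f.read().strip()
    except OSError:
        return None              # no such file here, or not permitted


if __name__ == "__main__":
    if os.path.exists("/proc/self/cgroup"):
        for hierarchy, controllers, path in current_cgroups():
            print(hierarchy, controllers, path, read_limit(path))
```

On a cgroup v2 system there is typically a single line with hierarchy 0 and an empty controller list.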

Cgroups vs. namespaces

  • Namespaces are about visibility: what can a process see?
    • Logical isolation of resources
  • cgroups are about limits: how much can a process use?
    • Resource accounting and enforcement
  • Together: processes appear isolated AND are prevented from consuming excessive resources

Containers as a building block

  • cgroups and namespaces are mechanisms; containers allow us to apply policy through orchestration
  • Containers (e.g. Docker) combine namespaces + cgroups + layered filesystems
  • Results in lightweight, portable process isolation
  • Single kernel, multiple isolated environments
  • Much cheaper than VMs, but with weaker isolation guarantees

What isn’t clear?

Comments? Thoughts?

Process communication

Questions to consider

  • What are the two main strategies for IPC?
  • How do they differ?
  • In what situations would you choose one over the other?

Processes give us a protection boundary

  • The operating system is responsible for isolating processes from each other
  • What you do in your own process is your own business, but it shouldn’t be able to crash the machine or affect other processes (or at least not processes started by other users)
  • Thus: safe intra-process communication is your problem; safe inter-process communication is an operating system problem

Why do we need IPC? What are the benefits?

  • Data Sharing: IPC allows processes to share data efficiently, which is crucial for applications requiring real-time data exchange
  • Modularity: It promotes modularity by enabling different parts of a system to communicate, making the system easier to manage and scale
  • Resource Utilization: IPC can help optimize resource utilization by allowing processes to coordinate their use of shared resources
  • Concurrency (scalability): It supports concurrent execution of processes, improving the overall performance and responsiveness of applications

What are the disadvantages of IPC?

  • Complexity: Implementing IPC can add complexity to the system, requiring careful design and management to avoid issues like deadlocks and race conditions
  • Overhead: IPC mechanisms can introduce overhead, potentially impacting performance, especially if the communication is frequent or involves large amounts of data
  • Security: Ensuring secure IPC can be challenging, as it involves protecting data from unauthorized access and ensuring the integrity of the communication
  • Debugging: Debugging IPC-related issues can be difficult, as problems may arise from interactions between multiple processes, making them harder to isolate and resolve

What are the two main categories of IPC?

  • Message passing
    • High-level abstraction for exchanging packets of information over some interconnect
  • Shared memory
    • Region of memory available to different processes; writable by at least one process

Message passing

  • Kernel establishes and oversees all communication
    • The sending process copies data to a buffer, then issues a system call to request the transfer
    • The kernel copies the data into its own memory
    • Later, the receiving process issues a system call to retrieve it
  • Two primitives: send() and recv()
  • Also works beyond a single machine: processes can communicate over a network, and the underlying link implementation is hidden from them
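A minimal message-passing sketch, assuming a Unix system: parent and child exchange one message each over a socket pair, using send() and recv() so that every transfer goes through the kernel. The function name ping_child and the "upper-case the message" reply are invented for the example.

```python
import os
import socket


def ping_child(message: bytes) -> bytes:
    """Send a message to a forked child and return the child's reply."""
    # Datagram sockets preserve message boundaries: one send() is one recv().
    parent_sock, child_sock = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    pid = os.fork()
    if pid == 0:
        try:
            parent_sock.close()              # child keeps only its own end
            data = child_sock.recv(1024)     # kernel hands over the queued message
            child_sock.send(data.upper())    # reply with a new message
        finally:
            os._exit(0)
    child_sock.close()                       # parent keeps only its own end
    parent_sock.send(message)                # copied into kernel memory
    reply = parent_sock.recv(1024)           # blocks until the child replies
    parent_sock.close()
    os.waitpid(pid, 0)
    return reply


if __name__ == "__main__":
    print(ping_child(b"ping"))
```

Neither process ever touches the other's memory; all data is copied in and out by the kernel, which is where both the safety and the per-message cost come from.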

Pros and cons of message passing?

  • Pros:
    • Easier to implement and manage, especially in distributed systems
    • Provides clear boundaries between processes, enhancing security and modularity
  • Cons:
    • Can introduce overhead due to the need for message formatting and transmission
    • May be slower compared to shared memory for large volumes of data

Shared memory

  • Kernel plays a role in establishing and attaching the address space, but does not control read/write access beyond that
  • How the memory is shared, and kept consistent, is left up to the processes
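For contrast, here is a shared-memory sketch, assuming a Unix system where an anonymous mmap is shared across fork(). The kernel sets up the mapping once; after that, reads and writes are plain memory accesses. The function name shared_counter and the starting value 10 are illustrative, and waiting for the child to exit stands in for real synchronization.

```python
import mmap
import os
import struct


def shared_counter():
    """Let a child increment a counter that lives in shared memory."""
    shm = mmap.mmap(-1, 8)                   # anonymous mapping, shared by default on Unix
    struct.pack_into("q", shm, 0, 10)        # parent writes the initial value
    pid = os.fork()
    if pid == 0:
        try:
            (value,) = struct.unpack_from("q", shm, 0)
            struct.pack_into("q", shm, 0, value + 1)   # visible to the parent
        finally:
            os._exit(0)
    # Crude synchronization: only read after the writer has finished.
    os.waitpid(pid, 0)
    (value,) = struct.unpack_from("q", shm, 0)
    shm.close()
    return value


if __name__ == "__main__":
    print("value after child's update:", shared_counter())
```

Unlike an ordinary variable, which each process copies at fork(), the mapped region is the same physical memory in both processes, so keeping it consistent is entirely up to them.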

Pros and cons of shared memory?

  • Pros:
    • Offers high-speed data exchange, as processes can directly read and write to the shared memory
    • Efficient for large volumes of data
  • Cons:
    • Requires careful synchronization to avoid conflicts and ensure data consistency
    • Can be more complex to implement and debug

Message passing vs. shared memory

  • Which do you choose?
    • If you have few messages?
    • If you have millions?
    • If you need to communicate across systems?
    • If you need in-order delivery but don’t want to code it yourself?
  • Considerations:
    • Cost to establish
    • Cost per message

What isn’t clear?

Comments? Thoughts?